DOCUMENT RESUME

ED 336 415                                        TM 017 183

AUTHOR        Cooley, William W.
TITLE         Testing and School Improvement. Policy Paper Number
              9. Pennsylvania Educational Policy Studies.
INSTITUTION   Pittsburgh Univ., Pa. Learning Research and
              Development Center.; Pittsburgh Univ., Pa. School of
              Education.
PUB DATE      3 Jun 91
NOTE          21p.
PUB TYPE      Viewpoints (Opinion/Position Papers, Essays, etc.)
              (120)

EDRS PRICE    MF01/PC01 Plus Postage.
DESCRIPTORS   *Academic Achievement; Educational Assessment;
              Educational Trends; Elementary Education; *Elementary
              Schools; Longitudinal Studies; *Predictor Variables;
              Socioeconomic Status; *State Programs; *Statistical
              Data; Student Evaluation; Testing Programs; *Test
              Results
IDENTIFIERS   *Pennsylvania



ABSTRACT 

Five years of state test results for Pennsylvania are 
examined, focusing on differences among elementary schools in which a 
majority of students do not master basic learning skills. Data are 
provided for 1,505 elementary schools in about 500 districts. The 
largest learning problems occur in about 10% of the schools; these 
schools have been consistently low for the past 5 years. More 
specifically, for about 150 of the schools, over half of the students 
score below minimum competence every year. The data reveal a stable 
set of relationships among schools and between school achievement and 
other factors. Socioeconomic differences of families are the major 
predictors of achievement differences among schools. Currently, 
student/teacher ratios are largely unrelated to students' needs or 
achievement growth. Research to establish the student achievement 
benefits of higher "per student spending" is difficult, since most 
spending variation among the 500 school districts in Pennsylvania is 
due to differences in teacher salaries or student/teacher ratios. 
Increasing teacher salaries could increase the quality of new 
teachers hired in the 1990s if valid teacher quality indicators are 
available and used by districts. Smaller classes could help if 
instruction is designed to take advantage of smaller class sizes and 
if use of smaller classes is managed. Four tables and two figures are 
included. Eight reports in the Pennsylvania Educational Policy 
Studies series are listed. (RLC) 



* Reproductions supplied by EDRS are the best that can be made
* from the original document.






Pennsylvania Educational Policy Studies 



PEPS is a joint effort of the U. of Pittsburgh's School of Education and the Learning Research and Development Center 

This is policy paper number 9 in this series 






Testing and School Improvement 



by 

William W. Cooley 
University of Pittsburgh 



June 3, 1991 



The purpose of this series of papers is to contribute to a more informed debate about critical 
policy issues facing Pennsylvania's public schools. This PEPS series draws upon a data base 
that has been established here at the University of Pittsburgh under the direction of William 
Cooley in cooperation with the Pennsylvania Department of Education. 






Reactions can be shared: 

by mail: LRDC, Pgh., PA 15260
by PittVAX: COOLEY
by FAX: 412-624-7088
by PENN*LINK: PEPS
by phone: 412-624-7085
by BITNET: COOLEY@PITTVMS
by chat: room 743, LRDC






Executive Summary 

This paper examines five years of state test results 
(TELLS), focusing upon differences among Pennsylvania's 
elementary schools, particularly those schools with a 
majority of their students not mastering essential 
learning skills. 

The largest learning problems occur in about 10% of 
the schools, and those schools have been consistently low 
for the past five years. Making public the test results 
of low scoring schools has not solved the problem. 

The achievement differences among schools are almost 
completely predictable from the socio-economic 
differences of the families being served by these 
schools. This is not news, but the persistent, dominant 
importance of home factors must be confronted if the 
situation is to improve. 

Simple but costly solutions, such as raising teacher 
salaries or reducing pupil/teacher ratios, do not look 
promising if that is all that is done. In fact, the 
lower scoring schools currently tend to have slightly 
better paid teachers and slightly smaller pupil/teacher 
ratios. 

Increasing teacher salaries could be helpful in 
increasing the quality of the thousands of new teachers 
to be hired in the 1990s if valid teacher quality 
indicators were available and used by districts in that 
hiring. Higher salaries can increase the quality of the 
applicant pool, but that will not necessarily increase 
the quality of the teaching staff if teachers are hired 
for reasons other than their ability to teach. 

Smaller classes can be helpful if instruction is 
designed to take advantage of them, and if the use of smaller 
classes is managed and does not just happen. Currently 
pupil/teacher ratios are largely unrelated to student 
needs or to student achievement growth. If it is business 
as usual, the possible positive achievement effect of 
smaller classes will continue to be elusive. 

These results illustrate why it has been so 
difficult for research to establish the student 
achievement benefits of higher per pupil spending, since 
most spending variation among our 500 school districts is 
due to differences in teacher salaries or pupil/teacher 
ratios. 






Testing and School Improvement 
William W. Cooley 

As the debates rage about how to reform, restructure 
and retest the nation's schools, it seems useful to 
examine what can be said about these issues based upon 
actual data from one state's testing program. This 
paper examines the results of testing in Pennsylvania's 
elementary schools between 1986 and 1990, and seeks 
implications for some of the current debates. The 
results are derived from the state database established 
as part of the Pennsylvania Educational Policy Studies. 

The student performance information is based upon a 
Test of Essential Learning and Literacy Skills (TELLS), 
which has been administered to all students in grades 3, 
5 and 8 since 1985. The indicator of school performance 
used here is the percent of the students in the school 
that score below the minimum cut-score established by the 
state. Included in these analyses are the 1,505 public 
elementary schools that served both 3rd and 5th graders 
between 1986 and 1990. 

Figure 1 illustrates how that school performance 
indicator distributes for grade 5 reading in 1989, for 
example. As seen there, most of the schools had fewer 
than 30% of their students score below the cut score, but 
about 150 schools had more than 50% of their students 
score below the minimum (the schools in the right hand 
"tail" of the distribution). Similar results were found 
for math and for all five years of TELLS testing. This 
paper explores why some schools seem to do so poorly, why 
it is serious, and what might be done to improve things. 

[Figure 1. Distribution of Schools, Grade 5 Reading, 1989. 
Vertical axis: Number of Schools (0 to 350). Horizontal axis: 
Percent of Students Below Cut Score (0 to 100). N = 1505 Schools.] 

It must be emphasized at the outset that this paper 
is looking at schools, not individual students. Thus 
the relationships reported here are among schools, not 
students. For example, schools with a large percentage 
of students from homes on welfare tend to have large 
numbers of students performing low on the TELLS test. 
That school level relationship is significantly stronger 
than is the student level relationship between home 
status and how individual students perform on the TELLS 
test. That is, it is much easier to predict the TELLS 
performance for a school than for a student, given home 
backgrounds. This point is important when considering 
what can be said about the effectiveness of a school, 
given school achievement results. 
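
The difference between school level and student level relationships can be illustrated with a small simulation. The sketch below is not based on TELLS data: the number of schools, the number of students per school, and the strength of the home effect are all made-up assumptions chosen only to show the pattern. Averaging over the students in a school cancels much of the individual noise, so the correlation among school averages comes out much higher than the correlation among individual students.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_schools=200, n_students=100, seed=0):
    """Return (student-level r, school-level r) for hypothetical data."""
    rng = random.Random(seed)
    student_ses, student_score = [], []
    school_ses, school_score = [], []
    for _ in range(n_schools):
        # Hypothetical school-wide home status shared by its students.
        school_level = rng.gauss(0, 1)
        ses = [school_level + rng.gauss(0, 1) for _ in range(n_students)]
        # Each student's score depends on home status plus a lot of noise.
        score = [0.5 * s + rng.gauss(0, 1.5) for s in ses]
        student_ses.extend(ses)
        student_score.extend(score)
        # School-level indicators are simple averages over students.
        school_ses.append(statistics.fmean(ses))
        school_score.append(statistics.fmean(score))
    return pearson(student_ses, student_score), pearson(school_ses, school_score)


student_r, school_r = simulate()
print(f"student-level r = {student_r:.2f}")
print(f"school-level r  = {school_r:.2f}")
```

Under these assumed settings the school level correlation is far stronger than the student level one, which is the same direction of difference described above for the TELLS results.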

It is also important to justify the school 
performance measure used here, which is the percent of 
students in the school who scored below the minimum cut 
score. The TELLS questions were the essential learning 
and literacy skills in reading and mathematics which 
committees of educators agreed all students should have 
mastered by that grade level. A student was below the 
cut score if more than about one-third of the questions 
were answered incorrectly. Even if one believes that the 
cut scores are arbitrary, that does not invalidate the 
kind of correlational analyses reported here. Changes in 
the cut score do not change the relative rankings of 
the schools. In fact, the relationships among schools 
based upon these established cut scores are identical to 
those that are obtained using school means as the school 
performance measure. 

Students have great difficulty learning from 
subsequent schooling if these basic skills are not 
mastered. For example, fourth grade instruction assumes 
the students possess the grade 3 TELLS skills. High 
schools assume these skills have been mastered. 
Employers require them. Having an entire school at risk 
because most of the students in that school do not know 
these skills is not acceptable. 

Other papers in this PEPS series have shown why the 
TELLS test should be replaced by a more appropriate state 
test for purposes of accountability, curriculum reform, 
and informing state policy (Cooley & Bernauer, 1990 and 
Cooley, 1990). For example, TELLS measures too narrow a 
spectrum of curriculum goals to be the basis for 
determining what is to be taught. But the available 
TELLS results do allow us to examine several important 
current issues, and it seems reasonable to figure out 
what can be learned from the available test results 
before moving on to some new enthusiasm. People often 
call for a new test because we didn't learn much from the 
last test, but the implications of test results are never 
self-evident, and we will continue to learn little or 
nothing from them if little or no analysis is conducted. 
TELLS does measure one aspect of what schooling is about, 
so let's see what we can learn. 

Stability of School Level Results 

The degree to which a school would be similarly 
ranked from one testing to another can best be described 
using correlations, where a correlation of 1.00 indicates 
identical rank orderings. Table 1 shows that school 
rankings on student performance in reading and math are 
very similar for grades 3 and 5, and for all five years. 
That is, no matter what grade or year one examines, 
schools tend to have the same rank order whether one 
ranks reading or ranks math results. The average 
correlation in Table 1 is .88, with a range of .86 to 
.92, a very stable set of relationships. 

Table 1 
Correlation Between Reading and Math 

Year      Grade 3    Grade 5 
1986        .87        .86 
1987        .88        .86 
1988        .90        .88 
1989        .88        .88 
1990        .92        .87 

Average = .88 
N = 1505 Schools 
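
To make concrete what it means for a correlation of 1.00 to indicate identical rank orderings, the sketch below computes a rank correlation for a handful of schools. The percent-below-cut values are hypothetical and are not taken from the TELLS data, and the paper does not say which correlation coefficient was used; this is only an illustration of the idea.

```python
def ranks(values):
    """Rank positions (1 = smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def rank_correlation(xs, ys):
    """Correlation between the rank orderings of two indicators."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical percent-below-cut values for six schools (not from the paper).
reading = [12.0, 18.0, 25.0, 31.0, 44.0, 58.0]
math_same_order = [9.0, 15.0, 16.0, 30.0, 35.0, 61.0]  # same ranking as reading
math_one_swap = [9.0, 16.0, 15.0, 30.0, 35.0, 61.0]    # two adjacent schools swapped

same = rank_correlation(reading, math_same_order)
swapped = rank_correlation(reading, math_one_swap)
print(f"identical ordering: {same:.2f}")
print(f"one adjacent swap:  {swapped:.2f}")
```

When the two orderings agree exactly the coefficient is 1.00, and a single swap of two neighboring schools already pulls it below 1.00, so values in the high .80s across 1,505 schools indicate rankings that are very nearly the same.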

Table 2 reveals another remarkable stability in 
school performance. The correlations reported there show 
the stability of school performance between 3rd and 5th 
grade. For example, third graders in spring 1986 
represent the same cohort of students that were in grade 
5 in 1988, and reading performance at grade 3 correlates 
.82 with reading performance at grade 5. Therefore most 
of the variation in school differences at grade 5 is 
predictable from their differences at grade 3. Notice 
that reading is more stable than math, but that both are 
very stable for the three cohorts available. 

Table 2 
Correlations for Same Cohort 

     Year of 
Grade 3    Grade 5    Read    Math 
 1986       1988      .82     .72 
 1987       1989      .85     .75 
 1988       1990      .81     .68 

School Performance and the Home 

Another indicator that is available for each school 
is the percent of the students that are from low income 
homes, particularly from families on welfare. Table 3 
reports the high correlation that exists between the 
percent of students below the TELLS cut-score and the 
percent of the students from low income families, for 
both grades for the five years. The average correlation 
for reading is .73, and for math it is .63. The square 
of the reading correlation indicates that 53% of the 
variation in reading performance among these 1,505 
schools is associated with this home socio-economic 
status (SES) indicator, and 40% for math. That is, 
reading performance is more dependent upon home factors 
than is math. 
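
The step from a correlation to a percent of variation is simply squaring the coefficient. The short sketch below repeats that arithmetic for the two average correlations quoted above; the dictionary and variable names are illustrative only.

```python
# Average school-level correlations between percent below cut and
# percent low income, as quoted in the text.
avg_r = {"reading": 0.73, "math": 0.63}

# The squared correlation is the share of variation in one indicator
# that is associated with the other.
shared = {subject: r ** 2 for subject, r in avg_r.items()}

for subject, share in shared.items():
    print(f"{subject}: r = {avg_r[subject]:.2f}, r squared = {share:.2f}")
```

Squaring .73 gives about .53 and squaring .63 gives about .40, which are the 53% and 40% figures in the paragraph above.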



Table 3 
% Below Cut and % Low Income 

Year    Grade    Read    Math 
1986      3      .69     .60 
          5      .73     .62 
1987      3      .71     .61 
          5      .74     .60 
1988      3      .70     .62 
          5      .73     .61 
1989      3      .75     .65 
          5      .74     .63 
1990      3      .74     .70 
          5      .76     .66 

Average Correlation    .73     .63 
N = 1505 Schools 



There are a few factors that affect the strength of 
the Table 3 relationships that should be mentioned. One 
such factor is the degree of overlap between the specific 
items in the TELLS test and the specifics of what was 
taught in each of these schools. As this overlap 
decreases, success on the test depends more and more on 
what is learned outside of school. Since there is no 
standard state curriculum, the resulting variation in 
overlap makes test performance more dependent upon home 
factors. This helps to explain why reading skills are 
more dependent upon home differences than are math 
skills, since the latter are less likely to be learned 
outside of school. 

Also, the difficulty of a test affects its 
relationship with home factors. As more difficult items 
have been added to the TELLS test over time, there is 
better discrimination among schools with more able 
students, with correlations increasing slightly over 
time, as seen in Table 3. 

A third factor that affects the home-achievement 
relationship is the degree to which schools are 
homogeneous with respect to home SES. The more schools 
differ in their average demographic makeup, the higher 
will be the relationship between SES and school 
achievement. If, for example, schools were completely 
integrated with respect to SES, the strong relationships 
of Table 3 would not be found. 

The very slight differences in these correlations 
within each column of Table 3 do reveal a high degree 
of consistency from year to year. It certainly shows 
that these indicators are not full of random error from 
grade to grade or year to year. But it also seems 
important to try to understand the implications of these 
trends for state testing programs and for school reform. 

One issue is whether the home influence persists as 
students move up the grades, or whether home SES simply 
affects the school readiness skills and other 
predispositions with which students begin school. This 
question was examined using a technique called structural 
modeling, which allows one to examine whether the SES 
factor is relevant to 5th grade achievement, given their 
third grade performance. This analysis was possible 
because the PEPS data base included longitudinal data 
that were based upon the same students who were 3rd 
graders in 1987 and fifth graders in 1989. 



[Figure 2. Structural Model: SES and School Achievement. 
Labels recoverable from the diagram: Read, Math, Minority, Low Income.] 

Figure 2 illustrates a simple structural model. As 
indicated there, the SES factor is measured by the 
percent minority and the percent low income in the 
school. Grade 3 is a school-achievement-deficits factor 
indicating the degree to which students are scoring below 
minimum on both reading and math. The Grade 5 factor 
indicates the degree to which this was true for these 
same students in fifth grade. The SES factor correlates 
.91 with Grade 3 and .94 with Grade 5. Grade 3 
correlates .86 with Grade 5. The main finding is that 
the SES factor continues to have a very strong direct 
effect between grade 3 and grade 5 school achievement. 
This means that whatever those processes are that result 
in strong relationships between SES and school 
performance, they continue as students move up the 
grades. Put another way, school differences in 5th grade 
achievement are even more dependent upon home differences 
than they were at 3rd grade. [The technical details of 
the structural modeling of these school level data will 
be provided in a forthcoming PEPS report.] 
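
One rough way to see why the three factor correlations quoted above imply a strong direct SES effect is to treat them as ordinary correlations and work out a two-predictor standardized regression of Grade 5 on SES and Grade 3. This is an assumption made here for illustration only: it is not the structural model itself, whose technical details are not given in this paper, and the function name is hypothetical.

```python
def standardized_regression(r_y1, r_y2, r_12):
    """Standardized weights and R squared for an outcome y regressed on
    two correlated predictors, computed from the three correlations.

    r_y1: correlation of y with predictor 1
    r_y2: correlation of y with predictor 2
    r_12: correlation between the two predictors
    """
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Factor correlations quoted in the text:
# SES with Grade 5 = .94, Grade 3 with Grade 5 = .86, SES with Grade 3 = .91.
beta_ses, beta_grade3, r2 = standardized_regression(r_y1=0.94, r_y2=0.86, r_12=0.91)

print(f"weight on SES:     {beta_ses:.2f}")
print(f"weight on Grade 3: {beta_grade3:.2f}")
print(f"R squared:         {r2:.2f}")
```

Under this simplification nearly all of the weight falls on SES (about .92) rather than on prior Grade 3 performance (about .03), and the two predictors together account for roughly 88% of the variation, in the same neighborhood as the figure reported for the structural model.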

Another sobering aspect of the Figure 2 analysis is 
the fact that this simple model explains 90% of the 
variation in 5th grade school achievement differences. 
That means that only 10% is left for all the different 
ways in which schools might be different: large or small, 
an effective or ineffective principal, curriculum similar 
to or different from the test, or the hundreds of other 
innovations, model schools, etc. that are out there. 

This does not mean that the situation is hopeless 
for schools serving low SES students, but it makes it 
very clear that the problem is not going to be solved by 
giving more tests. The 150 schools that were "in deep 
trouble" in 1986 are pretty much the same schools that 
are scoring low on the test today. Five years of public 
accountability (e.g., publishing school results in the 
local newspapers) has not made a difference. It makes it 
clear that reducing the large number of students not 
mastering basic skills in some schools is going to be 
very difficult. 

It is possible in this type of analysis to give 
homes undue credit or blame. Better teachers may be more 
likely, or good school practices more likely, in schools 
serving students from higher SES homes. If school 
practices or resources are correlated with home SES, then 
the Figure 2 analysis would be unfairly attributing to 
homes some of the credit or blame that was due the 
schools. 

Another PEPS report (Cooley, 1990) established that 
the major ways in which high-expenditure-per-pupil 
districts differed from low spenders are that they paid 
teachers more and hired more teachers relative to their 
student enrollment. So let's next examine how these 
schools differed in pupil/teacher ratio (P/T) and average 
teacher salary. Table 4 summarizes these correlations. 

For example, looking down the first column of Table 
4, Read3 is the percent of students below cut on the 
grade 3 TELLS reading test for 1987. That variable 
correlates .85 with 1989 grade 5 reading, as was shown in 
Table 2, and correlates .69 with percent low income, as 
in Table 3. The percent minority correlates .72 with 
Read3, about the same as percent low income. The .13 
with teacher salaries indicates a very slight tendency 
for higher average teacher salaries to be associated with 
more third graders below the cut score, while the -.21 
with P/T (pupils per teacher) indicates that schools with 
more poor readers have slightly fewer pupils per teacher. 
Let's look at those last two relationships more 
carefully. 



Table 4 
Correlations Among Schools 

           Read3    Read5    LowInc   Minrty   TchSal    P/T 
Read3       1.00      .85      .69      .72      .13    -.21 
Read5        .86     1.00      .74      .74      .16    -.19 
LowInc       .69      .74     1.00      .68      .00    -.18 
Minrty       .72      .74      .68     1.00      .30    -.21 
TchSal       .13      .15      .00      .30     1.00    -.16 
P/T         -.21     -.19     -.16     -.21     -.16    1.00 

N = 1505 Elementary Schools 



The average elementary school in Pennsylvania has 20 
pupils for each teacher in the school. But the variation 
in P/T is large. Five percent of the schools have a P/T 
less than 15, and in the top five percent, the P/T is 
greater than 26. But this variable exhibits very low 
correlations with other characteristics of the schools 
(except, of course, costs per pupil). The slight 
tendencies that do exist (the P/T column in Table 4) 
indicate that lower P/T are slightly associated with 
schools that serve more low income homes, in part because 
the low SES schools have extra teachers supported by 
compensatory education funds. So if smaller P/T implies 
a more effective school, that does not explain away the 
strong home effect of Figure 2. In fact it tends to work 
against it. Students from higher SES homes tended to 
attend schools with slightly more pupils per teacher, as 
seen in Table 4. These results do not refute the 
possibility that some kinds of instruction with some 
kinds of students would be facilitated by smaller 
classes, but the results do indicate that low SES schools 
are not handicapped because their classes tend to be 
larger than in high SES schools. That is clearly not the 
case. 

Turning to teacher salaries, the range in average 
salaries for the elementary schools in 1989 was from 
$18,900 to $47,000, with the average school paying an 
average salary of $30,000. So there is about a $28,000 
difference between the lowest paying school and the 
highest. But this average teacher salary correlates zero 
(that's 0.00!) with percent low income students. So the 
big effect we find for the home is not because schools 
serving high SES homes pay their teachers more. 

It is possible that higher salaries or lower P/T 
ratios might at least explain improvement in achievement 
test performance, even though they are unrelated to SES. 
So far my many attempts at finding a significant salary 
or P/T effect have failed. The variation in P/T or 
average salary among these 1,505 schools does not explain 
the variation in percent below the cut-score, controlling 
for other possible factors, including prior performance. 
Although a previous PEPS study (Cooley, 1990) reported a 
slight teacher salary effect at the district level, it 
seems to wash out when examined at the school level. 

The dynamic that leads one to expect a connection 
between teacher salaries and student achievement goes 
like this: higher salaries attract a larger, more 
qualified pool of applicants; if a district pays 
teachers more, it will also be able to retain its 
better teachers; better teachers will produce better 
students. One place where this may be breaking down is 
in establishing what is a "better" applicant. I was 
recently talking with a group of district superintendents 
about how they selected which teacher to hire from a 
large pool of applicants. The responses had more to do 
with their Board's preferences for demographics than 
anything else. For example, one superintendent reported 
that out of 150 applicants for one position, he would 
have to select from among those few teacher applicants 
who actually lived in that school district. Even if they 
could hire based upon some quality indicators, they did 
not seem to have any criteria for distinguishing among 
teachers in terms of expected teacher effectiveness. 
Also, teachers who are good at integrating art into the 
curriculum, for example, may not be good at figuring out 
how to teach basic skills to low SES students. 

Summary and Conclusions 

Five years of TELLS results reveal a very stable 
set of relationships among schools and between school 
achievement and other factors, particularly factors 
associated with the homes being served by a school. It 
is also clear that there are many schools in which a 
majority of the students are not learning those basic 
skills that are needed to profit from subsequent 
schooling, or needed for employability. For example, 
there are about 150 schools in Pennsylvania in which, 
year after year, over half of the students score below 
minimum competence. 

The persistent, dominant importance of home factors 
has been established and re-established for at least the 
past 25 years. It is not exactly news. The initial 
individual differences which children bring to school 
seem to determine the achievement differences among 
schools in the early grades, and those differences 
persist. So what is to be done? 

There is no reason to be optimistic about some of 
the quick fixes often proposed; for example, better 
tests, higher teacher salaries, or smaller classes. 
Teacher salary differences and pupil/teacher ratios are 
the main differences among districts with different 
spending levels, so if what districts do when they have 
more money is raise salaries or hire more teachers, then 
more money alone will probably not raise the basic skills 
levels of these low SES schools, unless many other things 
are changed as well, such as the kinds of teacher quality 
information districts have and use when they hire new 
teachers. 

What has impressed me the most as I have worked with 
these state-wide data is how interdependent the many 
variables seem to be. We are truly dealing with a 
massive system. Attempts at trying to manipulate one 
aspect of the system externally (e.g. establishing 
minimum teacher salaries) may not have the intended 
effect. Systems often react to intrusions in 
counter-intuitive ways. 

There have been several proposals for improving the 
basic skills of students from low SES homes: (1) find 
ways to improve home processes that are relevant to 
school learning; (2) invest in preschool experiences 
that increase school readiness skills and predispositions 
for low SES students; (3) figure out how to make schools 
more adaptive to the differences which students bring to 
school. But evaluating those proposals goes beyond the 
scope of this paper. We have looked at the pre-school 
effect, and it is clearly a positive influence. James 
Bernauer will be reporting on this in a forthcoming PEPS 
report. 

What can be said with confidence about systems is 
that manipulations of single variables may not produce 
the desired effects. Systems do not change that easily 
or in easily predicted ways. Some argue that we need to 
modify the power relationships within the system before 
we can expect significant improvements. Others want to 
apply principles from the marketplace to force schools 
to improve or close down. One thing that seems 
clear is that we need to learn more from our educational 
research efforts if we are to effectively guide the many 
reform efforts being proposed. This modest PEPS effort 
is working toward that end. 




Previous PEPS Reports 



1. The Public Schools and Regional Economic Change by 
William W. Cooley and Maureen W. McClure, September, 1989. 
[focuses upon southwestern Pennsylvania] 

2. Inequalities and Inequities in the Pennsylvania 
Public Schools by William W. Cooley, November, 1989. 
[primarily concerned with the inequities in school 
finance and the variations in student performance] 

3. School Comparisons in State Wide Testing Programs by 
William W. Cooley and James A. Bernauer, February, 1990. 
[explains why school comparisons of student 
achievement are usually misleading and how to make them 
more interpretable] 

4. Important Variations Among Pennsylvania School 
Districts by William W. Cooley, September, 1990. 
[describes how financial data, teacher data, and 
student achievement are interrelated] 

5. The Development of a Multilevel Model of State Level 
Student Achievement by Lawrence Bernstein, October, 1990. 
[applies a new statistical model for analyzing 
multilevel state data: district level and student level] 

6. Student Assessment in Pennsylvania by William W. 
Cooley, December, 1990. 
[recommends a statewide student assessment system to 
replace the present TELLS test] 

7. Confidentiality of Education Data and Data Access by 
William W. Cooley, April, 1990. 
[discusses the privacy issues involved in establishing 
research data bases] 

8. Fiscal Strain in Pennsylvania's School Districts by 
William W. Cooley, February, 1991. 
[analysis of five year trends in district revenues 
and expenditures] 






